A Kernel Approach to Comparing Distributions
Authors
Abstract
We describe a technique for comparing distributions without the need for density estimation as an intermediate step. Our approach relies on mapping the distributions into a Reproducing Kernel Hilbert Space. We apply this technique to construct a two-sample test, which is used for determining whether two sets of observations arise from the same distribution. We use this test in attribute matching for databases using the Hungarian marriage method, where it performs strongly. We also demonstrate excellent performance when comparing distributions over graphs, for which no alternative tests currently exist.
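The core idea, comparing two samples through their embeddings in a Reproducing Kernel Hilbert Space rather than through estimated densities, can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the abstract: the Gaussian kernel and its `bandwidth`, the biased form of the statistic, the function names, and the permutation procedure used to obtain a p-value (the paper may set its test threshold differently).

```python
import math
import random


def gaussian_kernel(x, y, bandwidth=1.0):
    """Gaussian (RBF) kernel between two points given as sequences of floats.

    The kernel choice and bandwidth are illustrative assumptions.
    """
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def mmd2_biased(xs, ys, kernel=gaussian_kernel):
    """Biased estimate of the squared RKHS distance between the mean
    embeddings of two samples. No density is estimated at any point."""
    m, n = len(xs), len(ys)
    k_xx = sum(kernel(a, b) for a in xs for b in xs) / (m * m)
    k_yy = sum(kernel(a, b) for a in ys for b in ys) / (n * n)
    k_xy = sum(kernel(a, b) for a in xs for b in ys) / (m * n)
    return k_xx + k_yy - 2.0 * k_xy


def permutation_p_value(xs, ys, num_permutations=200, seed=0):
    """Approximate p-value for the hypothesis that both samples come from
    the same distribution, obtained by reshuffling the pooled sample.

    The permutation scheme is a stand-in chosen for simplicity.
    """
    rng = random.Random(seed)
    observed = mmd2_biased(xs, ys)
    pooled = list(xs) + list(ys)
    m = len(xs)
    exceed = 0
    for _ in range(num_permutations):
        rng.shuffle(pooled)
        if mmd2_biased(pooled[:m], pooled[m:]) >= observed:
            exceed += 1
    # Add-one correction keeps the p-value strictly positive.
    return (exceed + 1) / (num_permutations + 1)


if __name__ == "__main__":
    rng = random.Random(1)
    sample_a = [(rng.gauss(0.0, 1.0),) for _ in range(30)]
    sample_b = [(rng.gauss(2.0, 1.0),) for _ in range(30)]
    print("MMD^2 estimate:", mmd2_biased(sample_a, sample_b))
    print("permutation p-value:", permutation_p_value(sample_a, sample_b))
```

Because the statistic depends on the data only through kernel evaluations, the same sketch would apply to structured inputs such as graphs once `gaussian_kernel` is swapped for a kernel defined on those objects.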
Similar resources
THE COMPARISON OF TWO METHOD NONPARAMETRIC APPROACH ON SMALL AREA ESTIMATION (CASE: APPROACH WITH KERNEL METHODS AND LOCAL POLYNOMIAL REGRESSION)
Small Area estimation is a technique used to estimate parameters of subpopulations with small sample sizes. Small area estimation is needed in obtaining information on a small area, such as sub-district or village. Generally, in some cases, small area estimation uses parametric modeling. But in fact, a lot of models have no linear relationship between the small area average and the covariat...
Comparing the Shape Parameters of Two Weibull Distributions Using Records: A Generalized Inference
The Weibull distribution is a very applicable model for the lifetime data. For inference about two Weibull distributions using records, the shape parameters of the distributions are usually considered equal. However, there is not an appropriate method for comparing the shape parameters in the literature. Therefore, comparing the shape parameters of two Weibull distributions is very important. I...
Comparison of the Gamma kernel and the orthogonal series methods of density estimation
The standard kernel density estimator suffers from a boundary bias issue for probability density function of distributions on the positive real line. The Gamma kernel estimators and orthogonal series estimators are two alternatives which are free of boundary bias. In this paper, a simulation study is conducted to compare small-sample performance of the Gamma kernel estimators and the orthog...
MODELING OF FLOW NUMBER OF ASPHALT MIXTURES USING A MULTI–KERNEL BASED SUPPORT VECTOR MACHINE APPROACH
Flow number of asphalt–aggregate mixtures as an explanatory factor has been proposed in order to assess the rutting potential of asphalt mixtures. This study proposes a multiple–kernel based support vector machine (MK–SVM) approach for modeling of flow number of asphalt mixtures. The MK–SVM approach consists of weighted least squares–support vector machine (WLS–SVM) integrating two kernel funct...
Kernel Maximum Mean Discrepancy for Region Merging Approach
Kernel methods are becoming increasingly challenging for use in a wide variety of computer vision applications. This paper introduces the use of Kernel Maximum Mean Discrepancy (KMMD) for region merging process. KMMD is a recent unsupervised kernel-based method commonly used in analysing and comparing distributions. We propose a region merging approach based on the KMMD framework which aims at i...
Journal title:
Volume / Issue:
Pages: -
Publication date: 2007